REMARKS

I. Introduction

In response to the Office Action dated March 11, 2004, no claims have been cancelled,
amended or added. Claims 1-36 remain in the application. Re-examination and re-consideration of
the application is requested.

II. Prior Art Rejections

A. The Office Action Rejections

In paragraph (5) of the Office Action, claims 1, 3-10, 12, 13, 15-22, 24, 25, 27-34, and 36
were rejected under 35 U.S.C. §103(a) as being unpatentable over Viniotis et al., "Linear
programming . . . Queueing systems," IEEE, 1988 (Viniotis) in view of Schneider et al., "Stochastic
Production scheduling . . . demand forecasts," IEEE, 1998 (Schneider). In paragraph (6) of the
Office Action, claims 2, 14, and 26 were rejected under 35 U.S.C. §103(a) as being unpatentable over
Viniotis in view of Schneider and further in view of Dangat et al., U.S. Patent No. 5,971,585
(Dangat). In paragraph (7) of the Office Action, claims 11, 23, and 35 were rejected under 35 U.S.C.
§103(a) as being unpatentable over Viniotis in view of Schneider and further in view of Hedlund et
al., "Optimal control of hybrid systems," IEEE, 1999 (Hedlund).
Applicant's attorney respectfully traverses these rejections.

B. The Applicant's Invention

Independent claims 1, 13 and 25 are generally directed to a method for solving, in a
computer, stochastic control problems of linear systems in high dimensions. Claim 1 is
representative, and comprises:

(a) modeling, in the computer, a structured Markov Decision Process (MDP), wherein a state
space for the MDP is a polyhedron in a Euclidean space and one or more actions that are feasible in
a state of the state space are linearly constrained with respect to the state; and

(b) building, in the computer, one or more approximations from above and from below to a
value function for the state using representations that facilitate the computation of approximately
optimal actions at any given state by linear programming.

C. The Viniotis Reference

Viniotis describes linear programming as a technique for optimization of queueing systems.
For a significant number of queueing models, that appear in diverse, seemingly unrelated application
areas, such as routing, resource allocation and flow control, the optimal policy exhibits a certain
"switching-curve" structure. In this paper, we formulate the optimal control problem of such models
in a unified way, by using abstract Linear Programming. Using well-known facts from sensitivity
analysis of Linear Programs, we show how certain properties of the optimal policy can be easily
derived, even in cases where Dynamic Programming (DP) and Stochastic Dominance (SD)
arguments fail. A structural property of the optimal value function of the Linear Program, namely
piecewise linearity, is exploited to derive properties of the optimal cost function. We also consider
additional problems in the realm of queueing system control in which DP or SD approaches are not
applicable but Linear Programming may provide useful results.

D. The Schneider Reference

Schneider describes stochastic production scheduling to meet demand. Production
scheduling, the problem of sequentially configuring a factory to meet forecasted demands, is a
critical problem throughout the manufacturing industry. The requirements of maintaining product
inventories in the face of unpredictable demand and stochastic factory output make the problem
difficult. Existing approaches commonly fall into one of two groups: either demand forecasts are
discarded and linearizing assumptions are made so methods based on optimal control can be
applied, or AI search methods are used to tackle the large search spaces and the ability to handle
stochasticity optimally is sacrificed. This paper describes a Markov Decision Process (MDP)
formulation of production scheduling which captures stochasticity, while retaining the ability to
construct a schedule to meet demand forecasts. The solution to this MDP is a value function,
specific to the current demand forecasts, which can be used to generate optimal scheduling decisions
online. The paper then describes an industrial application and a reinforcement learning method for
generating an approximate value function in this domain. The results demonstrate that in both
deterministic and noisy scenarios, value function approximation is an effective technique.

E. The Dangat Reference 

Dangat describes a computer implemented decision support tool that serves as a solver to
generate a best can do (BCD) match between existing assets and demands across multiple
manufacturing facilities within boundaries established by manufacturing specifications and process
flows and business policies to determine which demands can be met in what time frame by
microelectronics (wafer to card) or related (for example disk drives) manufacturing and establishes a
set of actions or guidelines for manufacturing to incorporate into their manufacturing execution
system to insure the delivery commitments are met in a timely fashion. The BCD tool has six major
components, a material resource planning explode or "backwards" component, an optional
STARTS evaluator component, an optional due date for receipts evaluator, an optional capacity
available versus needed component, an implode "forward" or feasible plan component, and a post
processing algorithm.



F. The Hedlund Reference

Hedlund describes optimal control of hybrid systems. This paper presents a method for
optimal control of hybrid systems. An inequality of Bellman type is considered and every solution to
this inequality gives a lower bound on the optimal value function. A discretization of this "hybrid
Bellman inequality" leads to a convex optimization problem in terms of finite-dimensional linear
programming. From the solution of the discretized problem, a value function that preserves the
lower bound property can be constructed. An approximation of the optimal feedback control law is
given and tried on some examples.

G. Applicant's Claims Are Patentable Over The References

Applicant's claims are patentable over the references because they recite a novel and
nonobvious combination of elements. Specifically, Applicant's claims are patentable because they
recite a novel and non-obvious combination of "displaying," "selecting," "mapping" and "invoking"
elements. Neither of the references, taken individually or in any combination, teaches or suggests
this sequence of steps.

The Office Action states the following: 



5. Claims 1, 3-10, 12, 13, 15-22, 24, 25, 27-34 and 36 are rejected under
35 U.S.C. 103(a) as being unpatentable over Viniotis et al. (VI) ("Linear
programming ... Queueing systems", IEEE, 1988) in view of Schneider et al. (SC)
("Stochastic Production scheduling ... demand forecasts", IEEE, 1998).

5.1 VI teaches Linear programming as a technique for optimization of
queuing systems. Specifically, as per Claim 13, VI teaches solving stochastic control
problems of linear systems in high dimensions (Page 652, CL1, Para 1; Page 653,
CL2, Para 3); comprising:

modeling a structured Markov Decision Process (MDP) (Page 652, CL1, Para
4; Page 652, CL2 Para 6), wherein a state space for the MDP is a polyhedron in a
Euclidean space (Page 654, CL2, Lemma 2);

one or more actions that are feasible in a state of the state space are linearly
constrained with respect to the state (Page 653, CL1, Para 1 and Para 2; Page 652,
CL2, Para 7); and

building a value function for the state using representations that facilitate the
computation of approximately optimal actions at any given state by linear
programming (Page 653, CL1, Para 9 to Page 654, CL1, Para 4; Page 652, CL2, Para
8).

VI does not expressly teach a computerized apparatus for solving stochastic
control problems of linear systems in high dimensions comprising a computer. SC
teaches a computerized apparatus for solving stochastic control problems of linear
systems in high dimensions comprising a computer (Page 2726, CL1, Para 3 and 4),
as that allows the solution of stochastic control problems of linear systems in high
dimensions run faster and allows the user to generate the results with varying data
(Page 2726, CL1, Para 3). It would have been obvious to one of ordinary skill in the
art at the time of Applicant's invention to combine the method of VI with the
apparatus of SC that included a computerized apparatus for solving stochastic
control problems of linear systems in high dimensions comprising a computer type,
as that would allow the solution of stochastic control problems of linear systems in
high dimensions run faster and allow the user to generate the results with varying
data.

VI does not expressly teach logic performed by the computer, for modeling a
structured Markov Decision Process (MDP). SC teaches logic performed by the
computer, for modeling a structured Markov Decision Process (MDP) (Page 2726,
CL1, Para 3 and 4), as that allows the solution of stochastic control problems of
linear systems in high dimensions run faster and allows the user to generate the
results with varying data (Page 2726, CL1, Para 3). It would have been obvious to
one of ordinary skill in the art at the time of Applicant's invention to combine the
method of VI with the apparatus of SC that included logic performed by the
computer, for modeling a structured Markov Decision Process (MDP), as that would
allow the solution of stochastic control problems of linear systems in high
dimensions run faster and allow the user to generate the results with varying data.

VI does not expressly teach logic performed by the computer, for building a
value function for the state using representations that facilitate the computation of
approximately optimal actions at any given state by linear programming. SC teaches
logic performed by the computer, for building a value function for the state using
representations that facilitate the computation of approximately optimal actions at
any given state by linear programming (Page 2726, CL1, Para 3 and 4), as that allows
the solution of stochastic control problems of linear systems in high dimensions run
faster and allows the user to generate the results with varying data (Page 2726, CL1,
Para 3). It would have been obvious to one of ordinary skill in the art at the time of
Applicant's invention to combine the method of VI with the apparatus of SC that
included logic performed by the computer, for building a value function for the state
using representations that facilitate the computation of approximately optimal
actions at any given state by linear programming, as that would allow the solution of
stochastic control problems of linear systems in high dimensions run faster and allow
the user to generate the results with varying data.

VI does not expressly teach logic performed by the computer, for building
one or more approximations from above and from below to a value function for the
state using representations. SC teaches logic performed by the computer, for building
one or more approximations from above and from below to a value function for the
state using representations (Page 2722, CL1, Para 2; Page 2724, CL2, Para 6), as value
function approximation is an effective technique for both deterministic and noisy
scenarios (Page 2722, CL1, Para 2); and approximation allows solving large scale
MDPs (Page 2722, CL2, Para 2). It would have been obvious to one of ordinary skill
in the art at the time of Applicant's invention to combine the method of VI with the
apparatus of SC that included logic performed by the computer, for building one or
more approximations from above and from below to a value function for the state
using representations, as value function approximation would be an effective
technique for both deterministic and noisy scenarios and approximation allows
solving large scale MDPs.

Applicant's attorney disagrees. Neither reference, taken individually or in combination,
discloses the specific combination of elements set forth in Applicant's independent claims 1, 13 and
25.

For example, the Office Action asserts that Viniotis teaches "a state space for the MDP is a
polyhedron in a Euclidean space," at page 654, CL2, Lemma 2. However, at the indicated location,
Viniotis merely states the following:

Lemma 2: If A is a totally unimodular matrix, the extreme points of the
polyhedron {y: Ay ≤ b}, where the vector b is integer-valued, are vectors with
integer components.

However, in Viniotis, A is a constraint matrix, not a state space. Moreover, Viniotis does 
not refer to a polyhedron in a Euclidean space. 
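
For illustration only, the subject matter of Lemma 2 can be made concrete with a minimal sketch. The matrix A, the vector b, and the helper name vertices_2d below are hypothetical and are not taken from Viniotis or from Applicant's specification; the sketch assumes a two-dimensional case and simply enumerates the extreme points of the constraint polyhedron {y: Ay ≤ b} for a small totally unimodular A and integer-valued b, confirming that they have integer components. Nothing in it involves an MDP state space.

```python
from fractions import Fraction
from itertools import combinations


def vertices_2d(A, b):
    """Enumerate extreme points of {y in R^2 : A y <= b}.

    Each candidate is the intersection of two constraint lines; it is kept
    only if it satisfies every constraint. Exact rational arithmetic is used
    so that integrality can be checked without rounding.
    """
    found = set()
    for i, j in combinations(range(len(A)), 2):
        (a11, a12), (a21, a22) = A[i], A[j]
        det = a11 * a22 - a12 * a21
        if det == 0:
            continue  # parallel constraints do not meet in a single point
        # Cramer's rule for the 2x2 system formed by constraints i and j
        y1 = Fraction(b[i] * a22 - a12 * b[j], det)
        y2 = Fraction(a11 * b[j] - b[i] * a21, det)
        if all(r[0] * y1 + r[1] * y2 <= rhs for r, rhs in zip(A, b)):
            found.add((y1, y2))
    return sorted(found)


# Hypothetical data: every square submatrix of A has determinant 0, 1 or -1,
# and b is integer-valued, as Lemma 2 requires.
A = [(1, 0), (0, 1), (-1, 0), (0, -1), (1, 1)]
b = [3, 2, 0, 0, 4]

points = vertices_2d(A, b)
all_integer = all(coord.denominator == 1 for p in points for coord in p)
```

In this sketch the rows of A are constraints on the variable y, which is the role A plays in the lemma.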

In another example, the Office Action asserts that Viniotis teaches "one or more actions
that are feasible in a state of the state space are linearly constrained with respect to the state," at page
653, CL1, Para 1 and Para 2; Page 652, CL2, Para 7. However, at the indicated locations, Viniotis
merely states the following:

Viniotis: page 653, CL1, Para 1 and 2

Thus, any linear cost functional that involves the state (e.g., delay), is linear in
the controls z_k. Selecting an optimal policy, therefore, reduces to minimizing a linear
functional; this minimization is constrained, since the states generated by the policy
have to belong to the state space S, a (possibly unbounded) subset of the
nonnegative integers. From the state equation, the constraints are also linear in the
control. But minimization of a linear functional over a linear constraint set is the
subject of Linear Programming.

There are some points that need attention. In a Linear Program, the control
variables are allowed to take values in a continuum, e.g., [0,1] or R^n. In (an
unconstrained) MDP problem, the controls are integer-valued. For example, in
resource allocation problems, where there are N+1 distinct actions available, z_k ∈
{0, 1, . . . , N}. Thus when reformulating the problem as a Linear Program, we in fact
"enlarge" the solution space. This will not be a problem if existence of integer-valued
optimal solutions is shown.

Viniotis: page 652, CL2, Para 7

In the next section we briefly present the technicalities of the formulation of
the MDP problem as a linear program; we use the notation developed in [7]. The
reader may find the missing details in [7, 14].

In reviewing the above, it can be seen that Viniotis teaches only that a linear cost functional
that involves the state is linear. However, these portions of Viniotis do not teach or suggest that
actions that are feasible in a state of the state space are linearly constrained with respect to the state,
in the context where a state space for the MDP is a polyhedron in a Euclidean space.

In another example, the Office Action asserts that Viniotis teaches "building a value
function for the state using representations that facilitate the computation of approximately optimal
actions at any given state by linear programming," at page 653, CL1, Para 9 to Page 654, CL1, Para
4; Page 652, CL2, Para 8. Applicant's attorney notes that the recitation of the limitations is
incorrect. Moreover, at the indicated locations, Viniotis merely states the following:

Viniotis: page 653, CL1, Para 9 to page 654, CL1, Para 4
Let Z be the set of all admissible policies; let be the subset of policies in Z
that are integer-valued. Define the β-discounted, finite horizon, expected cost of
policy z, when the system starts from state x at time k = 0, and is allowed to "move"
for n steps (i.e., perform n transitions), as
(Eqn.(5))

where L(x, z) is a linear function of the state trajectory and the control process
z; it has the interpretation of an instantaneous cost. A fairly general form for L, that
fits our purposes is

(Eqn.(6))

where c, d are properly dimensioned vector constants. In resource allocation 
problems, where delay is the cost, we have d = 0; in pure blocking systems, we 
choose c = 0. 

To show the exact dependence of J_n(x, z) on x and z, let us rewrite (4) as
(Eqn.(7))

Then since x is constant and (Eqn.), where p denotes the probability
distribution on Ω^k, we have
(Eqn.(8))

Equation (8) stresses the fact that the cost function is linear in the variables
z_k(w^k). The dependence of the cost on the probability distribution, the transitions
and the constants c, d is "hidden" in v_k(w^k), to emphasize the dependence of the cost
on the policy z. The exact form of v_k(w^k) can be routinely determined for the specific
problem in hand [14]. We need only mention that v_k(w^k) is independent from the
control policy z and the initial state x. For the purposes of the discussion in this
section, the exact form of v_k(w^k) is irrelevant.

From (8) we see that the optimal policy is the one that minimizes the second
term in the right hand side. From (7) the constraints fall in general into two
categories:

(a) nonnegativity of states, namely
(Eqn.(9))

(b) boundedness of states, namely
(Eqn.(10))

where U is the bound. Since the constraints in (10) (≤) are easily converted
into constraints as in (9), we shall concentrate on constraints of the form (9) only.
Summarizing, the LP equivalent problem may take the form
min e2 (p) 
AZ<b 

This form is suitable to present results from sensitivity analysis.

Remark. The control variables are (Eqn.), and thus there is only a finite
number of them. The constraint matrix A has elements that depend only on the
transitions ?k(w^. The vector b depends only on the initial state x.

We have allowed z_k(w^k) to take values in [0,1]. For sensitivity analysis, x, the
initial state of the queueing system, should be also continuously-valued. In this case,
the trajectory i will be continuously-valued; such a trajectory does not of course
correspond to a real queueing system.

If, however, x, z_k(w^k) are restricted to take integer-valued values only, then i
will be integer-valued; in this case it does represent the evolution of the queueing
system. The optimal cost function of the MDP in this case is given by

(Eqn.(11))

This is actually a problem in Integer Programming, the sensitivity analysis of
which is not as well developed as that of a Linear Program. If we remove the
restriction on integer-valued policies (and states), we have the above mentioned
Linear Programming problem (P). Let

(Eqn.(12))

denote the optimal value function of problem (P). It is W_n(x) for which
results from sensitivity analysis apply. We wish to emphasise here that the functions
W_n, V_n are quite different; first of all, they are even defined on different domains. If
we can make, however, a suitable connection between them, then we can relate the
properties of W_n (which we shall determine) to those of V_n (which we want).

Such a connection is indeed possible, if the Linear Program in (12) admits an
integer-valued solution. In this case, for integer-valued x, (11) and (12) refer to the
same problem. The optimal value function of the LP "contains" in some sense the
optimal value of the MDP: we can recover V_n(x) by "interpolating" W_n(x) at the
integer-valued points of its domain. Consequently, all the properties of W_n(x) are
automatically properties of V_n(x) as well.

Viniotis: page 652, CL2, Para 8

Briefly, the procedure is as follows. From equation (1) (or (2)) the state is a
linear function of the control actions z_k.

In reviewing the above, it can be seen that Viniotis teaches only the formulation of an MDP
and the definition of a value function. However, the indicated locations in Viniotis cannot be
interpreted as teaching the limitations of Applicant's claim directed to "building approximations from
above and from below to a value function for the state using representations that facilitate the
computation of approximately optimal actions at any given state by linear programming" (which
differs from the recitation of the limitation found in the Office Action).

In another example, the Office Action asserts that Schneider teaches "building a value
function for the state using representations that facilitate the computation of approximately optimal
actions at any given state by linear programming," at page 2726, CL1, Para 3 and 4. Further, the
Office Action states that Schneider teaches "building one or more approximations from above and
from below to a value function for the state using representations," at page 2722, CL1, Para 2 and
page 2724, CL2, Para 6. Again, Applicant's attorney notes that the recitation of the limitations is
incorrect. Moreover, at the indicated locations, Schneider merely states the following:

Schneider: page 2726, CL1, Para 3 and 4

Our experiments consider both deterministic and noisy versions of the
problem. To build the deterministic version of the problem, we ran long (stochastic)
simulations for each of the 421 actions and cached the mean observed production
rate for each. For the noisy versions, we could have used noisy outcomes directly
from the stochastic simulation, but instead we simply added Gaussian noise to the
cached, deterministic production rates. This enabled our experiments to run
significantly faster, and also allowed us to easily generate empirical results with
varying amounts of noise.

Table 1 shows experimental results. The computation times reported are on a
200 MHz Pentium Pro. The first section contains results for the case where the
factory output is deterministic and known. The purpose of the first two lines is to
delimit the range of results we should expect from good algorithms. The "Random"
algorithm builds a schedule by choosing 8 configurations at random, and it loses an
enormous amount of money. Much of the cost is due to heuristic penalties for failing
to satisfy customer demand.

Schneider: page 2722, CL1, Para 2

In this paper, we describe a Markov Decision Process (MDP) formulation of
production scheduling which captures stochasticity, while retaining the ability to
construct a schedule to meet demand forecasts. The solution to this MDP is a value
function, specific to the current demand forecasts, which can be used to generate
optimal scheduling decisions online. We then describe an industrial application and a
reinforcement learning method for generating an approximate value function in this
domain. Our results demonstrate that in both deterministic and noisy scenarios,
value function approximation is an effective technique.

Schneider: page 2724, CL2, Para 6

Here we describe a principled approach to generating closed-loop production
scheduling policies with reinforcement learning methods. It combines the capabilities
of both optimal control and AI search based methods. The approach is based on
representing the problem as an MDP and representing the solution as an
approximate value function. In contrast to many optimal control based methods, it
produces a time-dependent policy specifically built to match current demand
forecasts, rather than a time-invariant policy that ignores all demand information
other than the current rate. Our experiments also demonstrate the ability to search
hundreds of alternative factory configurations.

In reviewing the above, it can be seen that Schneider teaches only a Markov Decision
Process (MDP) formulation of production scheduling which captures stochasticity, wherein the
solution to the MDP is an approximate value function, specific to the current demand forecasts,
which can be used to generate optimal scheduling decisions online. However, the indicated
locations in Schneider cannot be interpreted as teaching "building approximations from above and
from below to a value function for the state using representations that facilitate the computation of
approximately optimal actions at any given state by linear programming" (which differs from the
recitation of the limitation found in the Office Action).

Dangat and Hedlund fail to overcome these deficiencies in the combination of Viniotis and
Schneider. Recall that Dangat and Hedlund were cited only against the dependent claims.

The various elements of Applicant's claimed invention together provide operational
advantages over Viniotis, Schneider, Dangat, and Hedlund. In addition, Applicant's invention solves
problems not recognized by Viniotis, Schneider, Dangat, or Hedlund.

Thus, Applicant submits that independent claims 1, 13, and 25 are allowable over Viniotis,
Schneider, Dangat, and Hedlund. Further, dependent claims 2-12, 14-24, and 26-36 are submitted
to be allowable over Viniotis, Schneider, Dangat, and Hedlund in the same manner, because they are
dependent on independent claims 1, 13, and 25, respectively, and thus contain all the limitations of
the independent claims. In addition, dependent claims 2-12, 14-24, and 26-36 recite additional novel
elements not shown by Viniotis, Schneider, Dangat, or Hedlund.

III. Conclusion

In view of the above, it is submitted that this application is now in good order for allowance
and such allowance is respectfully solicited. Should the Examiner believe minor matters still remain that
can be resolved in a telephone interview, the Examiner is urged to call Applicant's undersigned
attorney.

Respectfully submitted,

GATES & COOPER LLP
Attorneys for Applicant

Howard Hughes Center
6701 Center Drive West, Suite 1050
Los Angeles, California 90045 
(310) 641;;|797 

Date: June 11, 2004
GHG/ 




Reg. No.: 33,500